
Add fixed effects and absorb parameters to DifferenceInDifferences #2

Merged
igerber merged 1 commit into main from claude/init-did-library-pvNmf on Jan 1, 2026
@igerber igerber commented Jan 1, 2026

  • Add fixed_effects parameter for low-dimensional categorical FE (dummy variables)
  • Add absorb parameter for high-dimensional FE (within-transformation)
  • Properly adjust degrees of freedom for absorbed fixed effects
  • Add comprehensive test suite for fixed effects functionality (8 new tests)
  • Update README with fixed effects usage examples and API documentation
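As a rough illustration of the `absorb` idea in the list above, the sketch below demeans the outcome and regressors within each absorbed group and then subtracts one degree of freedom per group mean. It is not the code from this PR: the helper names `within_transform` and `ols_absorbed`, the column names, and the exact degrees-of-freedom formula (`n - k - n_groups`) are assumptions made for the example.

```python
import numpy as np
import pandas as pd


def within_transform(df, cols, absorb):
    """Demean each column in `cols` within the groups of `absorb` (one-way FE)."""
    out = df.copy()
    out[cols] = df[cols] - df.groupby(absorb)[cols].transform("mean")
    return out


def ols_absorbed(df, y, x, absorb):
    """OLS on within-transformed data with a df correction (hypothetical helper)."""
    demeaned = within_transform(df, [y] + x, absorb)
    X = demeaned[x].to_numpy(dtype=float)
    Y = demeaned[y].to_numpy(dtype=float)
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta
    n_fe = df[absorb].nunique()
    # Assumed adjustment: observations minus slopes minus one mean per group.
    df_resid = len(df) - X.shape[1] - n_fe
    sigma2 = resid @ resid / df_resid
    return beta, df_resid, sigma2


# Toy panel: 5 units observed 4 times each, with a unit-specific intercept.
rng = np.random.default_rng(0)
panel = pd.DataFrame({
    "unit": np.repeat(np.arange(5), 4),
    "x": rng.normal(size=20),
})
panel["y"] = 2.0 * panel["x"] + 1.5 * panel["unit"] + rng.normal(scale=0.1, size=20)

beta, df_resid, sigma2 = ols_absorbed(panel, "y", ["x"], "unit")
```

By the Frisch-Waugh-Lovell theorem the slope from the demeaned regression equals the slope from a regression with one dummy per unit, which is why the within-transformation is a practical substitute when the number of categories is large.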

@igerber igerber merged commit 860f8c8 into main Jan 1, 2026
igerber added a commit that referenced this pull request Apr 16, 2026
- P1 #1: _compute_heterogeneity_test now accepts obs_survey_info and
  runs survey-aware WLS + Binder TSL IF when survey_design is active.
  Point estimate via solve_ols(weights=W_elig, weight_type='pweight');
  group-level IF ψ_g[X] = inv(X'WX)[1,:] @ x_g * W_g * r_g, expanded
  to obs-level via w_i/W_g ratio, then compute_survey_if_variance for
  stratified/PSU variance. safe_inference uses df_survey.
  Rank-deficiency short-circuits to NaN to avoid point-estimate/IF
  mismatch between solve_ols's R-style drop and pinv's minimum-norm.
- P1 #2: twowayfeweights() now accepts Optional[SurveyDesign]. When
  provided, resolves weights via _resolve_survey_for_fit and passes
  them to _validate_and_aggregate_to_cells, restoring fit-vs-helper
  parity under survey-backed inputs. fweight/aweight rejected.
- P3: REGISTRY updates — TWFE parity sentence now includes survey;
  heterogeneity Note documents the TSL IF mechanics and library
  extension disclaimer; checklist line-651 lists survey-aware
  surfaces; new survey+bootstrap-fallback Note after line 652.
- P2: 5 new regression tests in test_survey_dcdh.py:
  TestSurveyHeterogeneity (uniform-weights match, non-uniform beta
  change, t-dist df_survey) and TestSurveyTWFEParity (fit-vs-helper
  match, non-pweight rejection).

All 254 targeted tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
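The influence-function formula in the commit message above can be sketched in plain NumPy as follows. This is a minimal illustration, not the library's `_compute_heterogeneity_test`: the function names `wls_slope_if` and `expand_to_obs` are hypothetical, `W_g` is assumed to be the sum of observation weights in group `g`, and the stratified/PSU variance step is not reproduced.

```python
import numpy as np


def wls_slope_if(X, y, W):
    """Weighted least squares and the influence function of coefficient 1."""
    XtWX = X.T @ (W[:, None] * X)
    beta = np.linalg.solve(XtWX, X.T @ (W * y))
    resid = y - X @ beta
    # Row 1 of inv(X'WX) picks out the slope on the second column of X.
    bread_row = np.linalg.inv(XtWX)[1, :]
    # psi_g = inv(X'WX)[1, :] @ x_g * W_g * r_g, computed for every group at once.
    psi_g = (X @ bread_row) * W * resid
    return beta, psi_g


def expand_to_obs(psi_g, group_idx, w_obs):
    """Spread each group's IF over its observations via the w_i / W_g ratio."""
    W_g = np.bincount(group_idx, weights=w_obs, minlength=len(psi_g))
    return psi_g[group_idx] * w_obs / W_g[group_idx]


# Toy data: 6 groups with 3 observations each and unequal observation weights.
rng = np.random.default_rng(1)
group_idx = np.repeat(np.arange(6), 3)
w_obs = rng.uniform(0.5, 2.0, size=18)
W = np.bincount(group_idx, weights=w_obs)
X = np.column_stack([np.ones(6), rng.normal(size=6)])
y = 1.0 + 0.5 * X[:, 1] + rng.normal(scale=0.2, size=6)

beta, psi_g = wls_slope_if(X, y, W)
psi_i = expand_to_obs(psi_g, group_idx, w_obs)
```

Two properties make the expansion easy to check: the group-level values sum to zero because of the weighted normal equations, and summing the observation-level values within each group recovers the group-level value.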
igerber added a commit that referenced this pull request Apr 16, 2026
- P1 #1: _compute_twfe_diagnostic now uses cell_weight (w_gt when
  available, else n_gt) for FE regressions, the normalization
  denominator, contribution weights, and the Corollary 1 observation
  shares. On survey-backed inputs the outputs now match the
  observation-level pweighted TWFE estimand; non-survey path is
  byte-identical.
- P1 #2: Zero-weight rows are dropped before the groupby in
  _validate_and_aggregate_to_cells when weights are provided, so that
  d_min/d_max/n_gt reflect the effective sample. Prevents zero-weight
  subpopulation rows from tripping the fuzzy-DiD guard or inflating
  downstream n_gt counts.
- P2: 2 new regression tests in test_survey_dcdh.py —
  TestSurveyTWFEOracle.test_survey_twfe_matches_obs_level_pweighted_ols
  verifies beta_fe matches an observation-level pweighted OLS under
  survey (would fail if n_gt was still used), and
  TestZeroWeightSubpopulation.test_mixed_zero_weight_row_excluded_from_validation
  verifies an injected zero-weight row with opposite treatment value
  doesn't trip the within-cell constancy check.

All 256 targeted tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
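The zero-weight handling described in the commit message above can be illustrated with a small pandas sketch. The helper `aggregate_to_cells` and the column names `group`, `time`, and `d` are hypothetical stand-ins, not the library's `_validate_and_aggregate_to_cells`; only the ordering (filter first, then group) is taken from the commit message.

```python
import numpy as np
import pandas as pd


def aggregate_to_cells(df, weights=None):
    """Collapse observations to (group, time) cells (hypothetical helper)."""
    data = df.copy()
    if weights is not None:
        data["_w"] = np.asarray(weights, dtype=float)
        # Drop zero-weight rows first so they cannot affect d_min/d_max/n_gt.
        data = data[data["_w"] > 0]
    else:
        data["_w"] = 1.0
    cells = data.groupby(["group", "time"]).agg(
        d_min=("d", "min"),
        d_max=("d", "max"),
        n_gt=("d", "size"),
        w_gt=("_w", "sum"),
    ).reset_index()
    # Within-cell constancy check: treatment must not vary inside a cell.
    if (cells["d_min"] != cells["d_max"]).any():
        raise ValueError("treatment varies within a (group, time) cell")
    return cells


# The third row has the opposite treatment value but zero weight.
obs = pd.DataFrame({
    "group": [1, 1, 1, 2, 2],
    "time": [0, 0, 0, 0, 0],
    "d": [0, 0, 1, 1, 1],
})
w = [1.5, 2.5, 0.0, 1.0, 1.0]
cells = aggregate_to_cells(obs, weights=w)
```

With the weights supplied, the stray row is excluded and the first cell has `n_gt == 2` and `w_gt == 4.0`. Calling the same helper without weights raises, because the unweighted data really does mix treatment values inside one cell.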
igerber added a commit that referenced this pull request Apr 17, 2026
- P1 #1/#2: Add _validate_group_constant_strata_psu() helper and call
  it from fit() after the weight_type/replicate-weights checks. The
  dCDH IF expansion psi_i = U[g] * (w_i / W_g) treats each group as
  the effective sampling unit; when strata or PSU vary within group it
  silently spreads horizon-specific IF mass across observations in
  different PSUs, contaminating the stratified-PSU variance. Walk back
  the overstated claim at the old line 669 comment to match. Within-
  group-varying weights remain supported.
- P1 #3: _survey_se_from_group_if now filters zero-weight rows before
  np.unique/np.bincount so NaN / non-comparable group IDs on excluded
  subpopulation rows cannot crash SE factorization. psi stays full-
  length with zeros in excluded positions to preserve alignment with
  resolved.strata / resolved.psu inside compute_survey_if_variance.
- REGISTRY.md line 652 Note updated: explicitly states the
  within-group-constant strata/PSU requirement and the
  within-group-varying weights support.
- Tests: new TestSurveyWithinGroupValidation class (4 tests — rejects
  varying PSU, rejects varying strata, accepts varying weights, and
  ignores zero-weight rows during the constancy check) plus
  TestZeroWeightSubpopulation.test_zero_weight_row_with_nan_group_id.

All 268 targeted tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
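A within-group constancy check of the kind described in the commit message above might look like the sketch below. The function `check_group_constant` and the column names are hypothetical and are not the library's `_validate_group_constant_strata_psu`. The sketch only mirrors the stated behaviour: strata and PSU must be constant within a group, weights may vary, and zero-weight rows are ignored.

```python
import pandas as pd


def check_group_constant(df, group, cols, weight):
    """Raise if any column in `cols` takes more than one value within a group."""
    # Zero-weight rows are outside the effective sample, so skip them.
    active = df[df[weight] > 0]
    n_unique = active.groupby(group)[cols].nunique(dropna=False)
    bad = n_unique[(n_unique > 1).any(axis=1)]
    if not bad.empty:
        raise ValueError(f"strata/PSU vary within groups: {list(bad.index)}")


# Group 2 contains a second PSU, but only on a zero-weight row.
survey = pd.DataFrame({
    "group": [1, 1, 2, 2, 2],
    "strata": [10, 10, 20, 20, 20],
    "psu": [100, 100, 200, 200, 201],
    "w": [1.0, 2.0, 1.0, 3.0, 0.0],
})
check_group_constant(survey, "group", ["strata", "psu"], "w")  # no error raised
```

Group 1 passes even though its weights differ across rows, which matches the note that within-group-varying weights remain supported. Giving the last row a positive weight would make the check raise for group 2.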